@MastersThesis{Begliomini:2022:CyMoUr,
author = "Begliomini, Felipe Nincao",
title = "Cyanobacteria monitoring on urban reservoirs using hyperspectral
orbital remote sensing data and machine learning",
school = "Instituto Nacional de Pesquisas Espaciais (INPE)",
year = "2022",
address = "S{\~a}o Jos{\'e} dos Campos",
month = "2022-05-30",
keywords = "cyanobacteria, C-phycocyanin, remote sensing, PRISMA, machine
learning, cianobact{\'e}rias, C-ficocianina, sensoriamento remoto,
aprendizado de m{\'a}quina.",
abstract = "Urban reservoirs provide relevant ecosystem services to
populations worldwide. Despite their recognized importance,
metropolitan water systems show an increasing degradation trend due
to anthropogenic impacts. Cultural eutrophication is highlighted as
a negative effect of human activities, with severe consequences
such as the intensification of algae blooms. Cyanobacteria are the
most concerning bloom-forming species for inland waters due to the
environmental impacts and potential to produce toxic compounds.
Therefore, this research presents a state-of-the-art methodology for
monitoring Cyanobacteria based on orbital hyperspectral images and
Machine Learning Algorithms (MLA) in tropical urban reservoirs.
The photosynthetic pigment C-Phycocyanin (PC) was used as a proxy
for Cyanobacteria biomass because this biliprotein is specific
to this algae group. Billings reservoir was chosen as the study
area due to the constant presence of Cyanobacteria and its
importance to the regional urban water supply. Eight field
campaigns were conducted to collect radiometric, photosynthetic
pigment, and taxonomic samples. A hyperspectral image from the
PRISMA sensor was acquired concurrently with one of the field
campaigns (matchup condition), and three atmospheric
correction algorithms were assessed (ASI, ACOLITE, and 6SV).
Synthetic multispectral Landsat-8/OLI and Worldview-3 images were
generated from PRISMA's best surface reflectance product. Random
Forest (RF), Extreme Gradient Boost (XgBOOST), and Support Vector
Machine (SVM) were chosen to retrieve PC from Remote Sensing data.
Previously published PC algorithms, Normalized Indices, and Line
Heights were generated from resampled in-situ radiometry for each
sensor. A data-driven feature selection followed by a
decorrelation procedure was used to identify the most informative
layers. The Grid Search algorithm tuned the hyperparameters. PC
was modeled from in-situ data through Monte Carlo simulations for
all assessed sensors and MLA. Then, the best combinations were
used for mapping PC in the hyperspectral and synthetic
multispectral images. The results for in-situ and orbital modeling
were compared with the state-of-the-art PC algorithm Mixture Density
Network (MDN) (OSHEA et al., 2021). PC concentrations from 0 to 301.81
$\mu$g/L were found, with mean and median values of 20.28 and
2.9 $\mu$g/L, respectively. Cyanobacteria species were at least abundant in
96% of the taxonomical samples. ASI was the best surface
reflectance product (MAE < 20% for the visible spectrum). ACOLITE
and 6SV underperformed ASI's product by a factor of two to ten. MDN
sharply overestimated PC in both orbital and in-situ assessments.
RF had the best estimates for all assessed sensors using in-situ
data, with MAE ranging from 59 to 86%. The best result from orbital
data was achieved by PRISMA/RF (MAE = 45%). XgBOOST produced the
best results for Worldview-3 (MAE = 49%) and Landsat-8/OLI (MAE =
74%) synthetic images. These are the best reported results for low
PC concentrations and reduced PC:Chla ratios. The low PC:Chla
ratios are also the most likely explanation for MDN's errors, since
the model was trained with samples whose PC:Chla was 6 times higher
than the mean found in this study. Speckle noise was identified in
hyperspectral mapping and is probably due to the reduced
Signal-to-Noise ratio. More studies assessing PC in tropical
waters are recommended to understand the effects of different
latitudes on PC production. Finally, Landsat-8/OLI was identified
as the most feasible sensor for monitoring PC due to its
reasonable accuracy, increased temporal resolution (8 days
with Landsat-9), and free data access policy. RESUMO: Os
reservat{\'o}rios urbanos oferecem importantes servi{\c{c}}os
ecossist{\^e}micos. Contudo, esses sistemas aqu{\'a}ticos
t{\^e}m a qualidade de suas {\'a}guas impactada pela
antropiza{\c{c}}{\~a}o. A eutrofiza{\c{c}}{\~a}o cultural
{\'e} destacada como um efeito negativo das a{\c{c}}{\~o}es
humanas e intensifica a ocorr{\^e}ncia de flora{\c{c}}{\~o}es
de algas. As Cianobact{\'e}rias s{\~a}o as esp{\'e}cies
formadoras de flora{\c{c}}{\~o}es mais preocupantes em
{\'a}guas continentais devido aos impactos ambientais causados e
o potencial para produzir compostos t{\'o}xicos. Portanto, esse
estudo apresenta uma metodologia para monitorar
Cianobact{\'e}rias por meio de imagens orbitais hiperespectrais e
Algoritmos de Aprendizado de M{\'a}quina (AAM) em
reservat{\'o}rios tropicais urbanos. O pigmento
fotossint{\'e}tico C-Ficocianina (PC) foi usado como proxy para a
biomassa de Cianobact{\'e}rias. O reservat{\'o}rio Billings
serviu como {\'a}rea de estudo devido {\`a} presen{\c{c}}a
constante de Cianobact{\'e}rias e o uso para o abastecimento
p{\'u}blico. Oito campanhas foram realizadas para coletar dados
radiom{\'e}tricos, pigmentos fotossintetizantes, e taxonomia.
Uma imagem hiperespectral do sensor PRISMA foi adquirida
concomitantemente com uma das amostragens, e tr{\^e}s algoritmos
de corre{\c{c}}{\~a}o atmosf{\'e}rica foram avaliados (ASI,
ACOLITE e 6SV). Imagens sint{\'e}ticas dos sensores Landsat-8/OLI
e Worldview-3 foram geradas pelo melhor produto de
reflect{\^a}ncia de superf{\'{\i}}cie do sensor PRISMA. Random
Forest (RF), Extreme Gradient Boost (XgBOOST), e Support Vector
Machine (SVM) foram escolhidos para modelar a PC. Algoritmos de
PC, {\'I}ndices Normalizados, e Line Heights foram gerados
por meio de dados radiom{\'e}tricos reamostrados para cada
sensor. Uma metodologia de sele{\c{c}}{\~a}o de atributos
baseada em dados foi utilizada para selecionar as
fei{\c{c}}{\~o}es mais informativas. O algoritmo Grid Search foi
aplicado para ajustar os hiperpar{\^a}metros. A PC foi modelada
com dados de campo por meio de Simula{\c{c}}{\~o}es Monte Carlo
para todos os sensores e AAM avaliados. As melhores
combina{\c{c}}{\~o}es foram usadas para mapear a PC nas imagens
multiespectrais sint{\'e}ticas e na hiperespectral. Os resultados
foram comparados com o algoritmo Mixture Density Network (MDN)
(OSHEA et al., 2021). Foram encontrados valores de PC entre 0 e
301,81 $\mu$g/L, com uma m{\'e}dia e mediana de 20,28 e 2,9
$\mu$g/L. As Cianobact{\'e}rias foram pelo menos abundantes em
96% das amostras taxon{\^o}micas. A ASI teve o melhor produto de
reflect{\^a}ncia de superf{\'{\i}}cie (MAE < 20% para o
espectro do vis{\'{\i}}vel). ACOLITE e 6SV tiveram resultados de
duas a dez vezes piores que o da ASI. O MDN superestimou os
valores de PC tanto nas an{\'a}lises in-situ como orbitais. O RF
obteve as melhores estimativas para todos os sensores com dados
in-situ, com MAE entre 59 e 86%. O melhor resultado para dados
orbitais foi obtido pelo PRISMA/RF (MAE = 45%). O XgBOOST teve os
melhores resultados para as imagens sint{\'e}ticas do Worldview-3
(MAE = 49%) e Landsat-8/OLI (MAE = 74%). Esses s{\~a}o os
melhores resultados reportados para baixas
concentra{\c{c}}{\~o}es de PC e baixas raz{\~o}es PC:Chla. A
raz{\~a}o PC:Chla tamb{\'e}m {\'e} a melhor
explica{\c{c}}{\~a}o para os erros do MDN, uma vez que o modelo
foi treinado com amostras 6 vezes maiores do que a PC:Chla deste
estudo. Mais estudos avaliando a PC em {\'a}guas tropicais devem
ser realizados para entender o impacto de diferentes latitudes na
produ{\c{c}}{\~a}o de PC. Finalmente, o sensor Landsat-8/OLI foi
identificado como o sensor mais adequado para o monitoramento de PC
devido {\`a}s suas m{\'e}tricas de predi{\c{c}}{\~a}o razo{\'a}veis,
alta resolu{\c{c}}{\~a}o temporal e acesso de dados gratuito.",
committee = "Novo, Evlyn M{\'a}rcia Le{\~a}o de Moraes (presidente) and
Barbosa, Cl{\'a}udio Clemente Faria (orientador) and Martins,
Vitor Souza (orientador) and Ciotti, {\'A}urea Maria and Nordi,
Cristina Souza Freire and Lamparelli, Marta Cond{\'e}",
englishtitle = "Monitoramento de cianobact{\'e}rias em reservat{\'o}rios urbanos
utilizando dados orbitais de sensoriamento remoto e algoritmos de
aprendizado de m{\'a}quina",
language = "en",
pages = "88",
ibi = "8JMKD3MGP3W34T/474PTSB",
url = "http://urlib.net/ibi/8JMKD3MGP3W34T/474PTSB",
targetfile = "publicacao.pdf",
urlaccessdate = "03 maio 2024"
}